
    Variance-Reduced and Projection-Free Stochastic Optimization

    The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints. However, in the stochastic learning setting, it is still relatively understudied compared to its gradient descent counterpart. In this work, leveraging a recent variance reduction technique, we propose two stochastic Frank-Wolfe variants which substantially improve previous results in terms of the number of stochastic gradient evaluations needed to achieve $1-\epsilon$ accuracy. For example, we improve from $O(\frac{1}{\epsilon})$ to $O(\ln\frac{1}{\epsilon})$ if the objective function is smooth and strongly convex, and from $O(\frac{1}{\epsilon^2})$ to $O(\frac{1}{\epsilon^{1.5}})$ if the objective function is smooth and Lipschitz. The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application.
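
    The abstract reports only complexity results, not pseudocode. As a hedged illustration of the general idea it describes (combining a variance reduction technique with projection-free Frank-Wolfe steps), here is a minimal NumPy sketch for a least-squares objective over the l1-ball; the objective, step-size schedule, batch size, and all function names are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def l1_ball_lmo(grad, radius=1.0):
    """Linear minimization oracle for the l1-ball: the minimizer of
    <grad, v> over ||v||_1 <= radius is a signed, scaled basis vector."""
    v = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    v[i] = -radius * np.sign(grad[i])
    return v

def vr_frank_wolfe(A, b, radius=1.0, epochs=20, inner=50, batch=10, seed=0):
    """Variance-reduced, projection-free minimization (illustrative sketch)
    of F(w) = (1/2n)||Aw - b||^2 over the l1-ball of given radius.

    Each epoch records a full gradient at an anchor point w0; inner steps
    use the variance-reduced estimate
        g = grad_S(w) - grad_S(w0) + full_grad(w0)
    (unbiased, with variance shrinking as w stays near w0), then take a
    Frank-Wolfe step toward the linear-oracle vertex instead of projecting."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w0 = w.copy()
        full_grad = A.T @ (A @ w0 - b) / n          # snapshot gradient
        for t in range(inner):
            S = rng.integers(0, n, size=batch)      # minibatch indices
            AS, bS = A[S], b[S]
            # variance-reduced stochastic gradient estimate
            g = (AS.T @ (AS @ w - bS) - AS.T @ (AS @ w0 - bS)) / batch + full_grad
            v = l1_ball_lmo(g, radius)              # projection-free direction
            gamma = 2.0 / (t + 2)                   # classic FW step size
            w = (1 - gamma) * w + gamma * v
    return w

# Tiny usage example on synthetic data.
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 50))
w_true = np.zeros(50); w_true[:3] = [1.0, -0.5, 0.25]
b = A @ w_true + 0.01 * rng.normal(size=200)
w_hat = vr_frank_wolfe(A, b, radius=2.0)
print("objective:", 0.5 * np.mean((A @ w_hat - b) ** 2))
```

    The linear oracle here returns a vertex of the l1-ball, which is why no projection step is ever needed; any constraint set with a cheap linear minimization oracle (e.g., a nuclear-norm ball) could be swapped in.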